Spectral Clustering with Two Views

Author

  • Virginia R. de Sa
Abstract

In this paper we develop an algorithm for spectral clustering in the multi-view setting, where there are two independent subsets of dimensions, each of which could be used for clustering (or classification). The canonical examples are simultaneous input from two sensory modalities, where input from each modality is considered a view, and web pages, where the text on the page is considered one view and the text on links to the page another. Our spectral clustering algorithm creates a bipartite graph and is based on the “minimizing-disagreement” idea. We show a simple artificially generated problem to illustrate when we expect it to perform well and then apply it to a web page clustering problem. We show that it performs better than clustering in the joint space and clustering in the individual spaces when some patterns have both views and others have just one view.

Appearing in Proceedings of the Workshop on Learning with Multiple Views, 22nd ICML, Bonn, Germany, 2005. Copyright 2005 by the author(s)/owner(s).

Spectral clustering is a very successful idea for clustering patterns. The idea is to form a pairwise affinity matrix $A$ between all pairs of patterns, normalize it, and compute eigenvectors of this normalized affinity matrix (graph Laplacian) $L$. It can be shown that the second eigenvector of the normalized graph Laplacian is a relaxation of a binary vector solution that minimizes the normalized cut on a graph (Shi & Malik, 1998; J. Shi & Malik; Meila & Shi, 2001; Ng et al., 2001). Spectral clustering has the advantage of performing well with non-Gaussian clusters as well as being easily implementable. It is also non-iterative with no local minima. The Ng, Jordan, and Weiss (Ng et al., 2001) (NJW) generalization to multiclass clustering (which we will build on) is summarized below for data patterns $x_i$ to be clustered into $k$ clusters.
• Form the affinity matrix $A(i, j) = \exp(-\|x_i - x_j\|^2 / 2\sigma^2)$
• Set the diagonal entries $A(i, i) = 0$
• Compute the normalized graph Laplacian as $L = D^{-1/2} A D^{-1/2}$, where $D$ is a diagonal matrix with $D(i, i) = \sum_j A(i, j)$
• Compute the top $k$ eigenvectors of $L$ and place them as columns in a matrix $X$
• Form $Y$ from $X$ by normalizing the rows of $X$
• Run k-means to cluster the row vectors of $Y$
• Pattern $x_i$ is assigned to cluster $\alpha$ iff row $i$ of $Y$ is assigned to cluster $\alpha$

In this paper we develop an algorithm for spectral clustering in the multi-view setting, where there are two independent subsets of dimensions, each of which could be used for clustering (or classification). The canonical examples are multi-sensory input from two modalities, where input from each sensory modality is considered a view, and web pages, where the text on the page is considered one view and the text on links to the page another. Computer vision applications with multiple conditionally independent sensor or feature vectors can also be viewed in this way.

1. Algorithm Development

Our spectral multi-view algorithm is based on ideas originally developed for the (non-spectral) Minimizing-Disagreement algorithm (de Sa, 1994a; de Sa & Ballard, 1998). The idea behind the Minimizing-Disagreement (M-D) algorithm is that two (or more)
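
As an illustration only, the single-view NJW steps listed earlier in this section can be sketched in Python with NumPy. This is a minimal sketch, not the author's implementation: the function name `njw_spectral_clustering`, the default `sigma`, the squared-distance Gaussian affinity, and the deterministic farthest-first k-means initialization are all assumptions made for the example, and it does not implement the two-view algorithm developed in the paper.

```python
import numpy as np


def njw_spectral_clustering(X, k, sigma=1.0, n_iter=100):
    """Hypothetical sketch of the NJW steps; returns one label per row of X."""
    X = np.asarray(X, dtype=float)

    # Affinity A(i, j) = exp(-||x_i - x_j||^2 / (2 sigma^2)), with zero diagonal.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    A = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)

    # Normalized matrix L = D^{-1/2} A D^{-1/2}, with D(i, i) = sum_j A(i, j).
    # The small floor guards against division by zero for isolated points.
    d = np.maximum(A.sum(axis=1), 1e-12)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    # Top-k eigenvectors as columns (eigh returns eigenvalues in ascending order).
    _, vecs = np.linalg.eigh(L)
    Xk = vecs[:, -k:]

    # Normalize each row to unit length to obtain Y.
    norms = np.maximum(np.linalg.norm(Xk, axis=1, keepdims=True), 1e-12)
    Y = Xk / norms

    # Assumed initialization: farthest-first seeding, so the sketch is deterministic.
    centers = [Y[0]]
    for _ in range(1, k):
        d2 = np.min([np.sum((Y - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Y[np.argmax(d2)])
    centers = np.array(centers)

    # Plain Lloyd iterations (k-means) on the rows of Y.
    for _ in range(n_iter):
        dists = np.linalg.norm(Y[:, None, :] - centers[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            Y[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers

    # Pattern x_i receives the cluster assigned to row i of Y.
    return labels
```

Calling `njw_spectral_clustering(points, k=2)` on an array of shape `(n, d)` returns an integer array of `n` cluster labels. The k-means stage here is iterative even though the eigenvector computation is not; in practice `sigma` has to be tuned to the scale of the data.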

Similar resources

Co-regularized Multi-view Spectral Clustering

In many clustering problems, we have access to multiple views of the data, each of which could be individually used for clustering. Exploiting information from multiple views, one can hope to find a clustering that is more accurate than the ones obtained using the individual views. Often these different views admit the same underlying clustering of the data, so we can approach this problem by lookin...

Multiple Non-Redundant Spectral Clustering Views

Many clustering algorithms only find one clustering solution. However, data can often be grouped and interpreted in many different ways. This is particularly true in the high-dimensional setting where different subspaces reveal different possible groupings of the data. Instead of committing to one clustering solution, here we introduce a novel method that can provide several non-redundant clust...

Guided Co-training for Large-Scale Multi-View Spectral Clustering

In many real-world applications, we have access to multiple views of the data, each of which characterizes the data from a distinct aspect. Several previous algorithms have demonstrated that one can achieve better clustering accuracy by integrating information from all views appropriately than using only an individual view. Owing to the effectiveness of spectral clustering, many multi-view clus...

Integration of Single-view Graphs with Diffusion of Tensor Product Graphs for Multi-view Spectral Clustering

Multi-view clustering takes diversity of multiple views (representations) into consideration. Multiple views may be obtained from various sources or different feature subsets and often provide complementary information to each other. In this paper, we propose a novel graph-based approach to integrate multiple representations to improve clustering performance. While original graphs have been wid...

Learning Social Circles in Ego Networks based on Multi-View Social Graphs

Automatic social circle detection in ego-networks is becoming a fundamentally important task for social network analysis, which can be used for privacy protection or interest group recommendation. So far, most studies focused on how to detect overlapping circles or how to perform detection using both network structure and its node profiles. This paper asks an orthogonal research question: how t...

Publication date: 2007